Quality of alignment comparison by COMPASS improves with inclusion of diverse confident homologs

Authors

  • Ruslan Sadreyev
  • Nick V. Grishin
Abstract

MOTIVATION Adding more distant homologs to a multiple alignment, and thus increasing its diversity, may eventually degrade the numerical profile constructed from this alignment. Here, we addressed the question of whether such a diversity limit can be reached in alignments of confident homologs found by PSI-BLAST, and we analyzed how the quality of the profile-profile comparison made by COMPASS depends on the sequence diversity within these alignments.

RESULTS Protein families that have a greater number of diverse confident homologs in the current sequence databases provide better similarity detection in profile databases, but produce on average less accurate profile-profile alignments with their remote relatives. This lower alignment accuracy cannot be improved by excluding the most distant members of these families from their profiles. On the contrary, the presence of more diverse members results in more accurate alignments. For families with a high diversity of confident homologs, the lower quality of profile alignments with their remote relatives seems to be an attribute of these families or their alignments, rather than being caused by the large number of diverse sequences itself. Our results suggest that, at any level of profile diversity, one should include in the multiple alignment as many confident sequence homologs as possible in order to produce the most accurate results.
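To make the two quantities in the abstract concrete, the following is a minimal sketch of how a numerical profile and a simple diversity measure could be derived from a multiple alignment. It is an illustration under stated assumptions, not the procedure used by COMPASS or PSI-BLAST: the function names `column_frequencies` and `mean_pairwise_identity` are hypothetical, the profile is a plain gap-ignoring frequency table without sequence weighting or pseudocounts, and diversity is approximated by mean pairwise identity (lower identity meaning higher diversity).

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa):
    """Return one dict per alignment column mapping residue -> frequency.

    Gaps ('-') and non-standard symbols are ignored. No sequence weights
    or pseudocounts are applied (a simplifying assumption).
    """
    profile = []
    for column in zip(*msa):
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values())
        if total:
            profile.append({aa: n / total for aa, n in counts.items()})
        else:
            profile.append({})
    return profile


def mean_pairwise_identity(msa):
    """Average fraction of identical residues over all sequence pairs.

    Only positions where both sequences have a residue are compared.
    """
    identities = []
    for a, b in combinations(msa, 2):
        aligned = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
        if aligned:
            matches = sum(x == y for x, y in aligned)
            identities.append(matches / len(aligned))
    return sum(identities) / len(identities) if identities else 1.0


# Toy alignment of three equal-length rows (illustrative data only).
toy_msa = ["ACD", "ACE", "A-D"]
toy_profile = column_frequencies(toy_msa)
toy_identity = mean_pairwise_identity(toy_msa)
```

Under these assumptions, adding a more distant homolog to `toy_msa` lowers `mean_pairwise_identity` and spreads the column frequencies over more residue types, which is the kind of diversity increase whose effect on profile quality the abstract examines.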

Similar resources

Protein homology detection by HMM-HMM comparison

MOTIVATION Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution. RESULTS We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on t...

COMPASS server for remote homology inference

COMPASS is a method for homology detection and local alignment construction based on the comparison of multiple sequence alignments (MSAs). The method derives numerical profiles from given MSAs, constructs local profile-profile alignments and analytically estimates E-values for the detected similarities. Until now, COMPASS was only available for download and local installation. Here, we present...

Effect of Objective Function on the Optimization of Highway Vertical Alignment by Means of Metaheuristic Algorithms

The main purpose of this work is the comparison of several objective functions for optimization of the vertical alignment. To this end, after formulation of optimum vertical alignment problem based on different constraints, the objective function was considered as four forms including: 1) the sum of the absolute value of variance between the vertical alignment and the existing ground; 2) the su...

COACH: profile-profile alignment of protein families using hidden Markov models

MOTIVATION Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed ...

Profile-profile comparisons by COMPASS predict intricate homologies between protein families.

Recently we proposed a novel method of alignment-alignment comparison, COMPASS (the tool for COmparison of Multiple Protein Alignments with Assessment of Statistical Significance). Here we present several examples of the relations between PFAM protein families that were detected by COMPASS and that lead to the predictions of presently unresolved protein structures. We discuss relatively straigh...

Journal:
  • Bioinformatics

Volume 20, Issue 6

Pages: -

Publication date: 2004